A Riemannian low-rank method for optimization over semidefinite matrices with block-diagonal constraints
We propose a new algorithm to solve optimization problems of the form min_X f(X) for a smooth function f, under the constraints that X is positive
semidefinite and the diagonal blocks of X are small identity matrices. Such
problems often arise as the result of relaxing a rank constraint (lifting). In
particular, many estimation tasks involving phases, rotations, orthonormal
bases or permutations fit in this framework, and so do certain relaxations of
combinatorial problems such as Max-Cut. The proposed algorithm exploits the
facts that (1) such formulations admit low-rank solutions, and (2) their
rank-restricted versions are smooth optimization problems on a Riemannian
manifold. Combining insights from both the Riemannian and the convex geometries
of the problem, we characterize when second-order critical points of the smooth
problem reveal KKT points of the semidefinite problem. We compare against state
of the art, mature software and find that, on certain interesting problem
instances, what we call the staircase method is orders of magnitude faster, is
more accurate and scales better. Code is available. (Comment: 37 pages, 3 figures.)
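The abstract's core idea — replacing the semidefinite variable X by a low-rank factorization Y Y^T and optimizing over a smooth manifold — can be illustrated on the simplest instance it mentions, the Max-Cut relaxation (block size 1, i.e. diag(X) = 1). The sketch below is not the paper's staircase algorithm; it is a minimal fixed-rank Burer–Monteiro-style gradient ascent on the unit-norm-rows (oblique) manifold, with illustrative names, rank, and step size.

```python
# Minimal sketch (not the paper's code): low-rank factorization X = Y Y^T
# for the Max-Cut relaxation  max <C, X>  s.t.  diag(X) = 1, X >= 0.
# Rank bound p, step size, and iteration count are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
n, p = 20, 3                                     # n nodes, rank bound p
A = rng.random((n, n)); A = (A + A.T) / 2
np.fill_diagonal(A, 0)                           # random weighted graph
L = np.diag(A.sum(1)) - A                        # graph Laplacian
C = L / 4                                        # Max-Cut relaxation cost

Y = rng.standard_normal((n, p))
Y /= np.linalg.norm(Y, axis=1, keepdims=True)    # feasible: unit-norm rows
obj0 = np.trace(C @ Y @ Y.T)                     # initial objective value

def riem_grad(Y):
    G = 2 * C @ Y                                # Euclidean gradient of <C, YY^T>
    # project onto the tangent space of the oblique manifold (unit rows)
    return G - np.sum(G * Y, axis=1, keepdims=True) * Y

for _ in range(500):
    Y += 1e-2 * riem_grad(Y)                     # Riemannian ascent step
    Y /= np.linalg.norm(Y, axis=1, keepdims=True)  # retraction back to manifold

X = Y @ Y.T                                      # candidate low-rank SDP solution
```

By construction X is positive semidefinite with unit diagonal, so it is feasible for the relaxation; the paper's staircase method additionally adapts the rank p and certifies global optimality, which this sketch does not attempt.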
Near-optimal bounds for phase synchronization
The problem of phase synchronization is to estimate the phases (angles) of a
complex unit-modulus vector z from their noisy pairwise relative measurements
C = zz* + sigma*W, where W is a complex-valued Gaussian random matrix.
The maximum likelihood estimator (MLE) is a solution to a unit-modulus
constrained quadratic programming problem, which is nonconvex. Existing works
have proposed polynomial-time algorithms such as a semidefinite relaxation
(SDP) approach or the generalized power method (GPM) to solve it. Numerical
experiments suggest both of these methods succeed with high probability for
noise levels sigma up to O(sqrt(n)); yet, existing analyses only
confirm this observation for sigma up to O(n^(1/4)). In this
paper, we bridge the gap by proving the SDP is tight for
sigma = O(sqrt(n / log n)), and GPM converges to the global optimum under
the same regime. Moreover, we establish a linear convergence rate for GPM, and
derive a tighter ℓ∞ bound for the MLE. A novel technique we develop
in this paper is to track (theoretically) closely related sequences of
iterates, in addition to the sequence of iterates GPM actually produces. As a
by-product, we obtain an ℓ∞ perturbation bound for leading
eigenvectors. Our result also confirms intuitions that use techniques from
statistical mechanics. (Comment: 34 pages, 1 figure.)
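The generalized power method (GPM) the abstract analyzes is simple to state: starting from a spectral estimate, repeatedly multiply by the measurement matrix and project each entry back to unit modulus. The following sketch assumes the model described above (C = zz* + sigma*W with Hermitian Gaussian noise); the problem size and noise level are illustrative, well inside the regime where the methods succeed.

```python
# Illustrative GPM iteration for phase synchronization (model from the
# abstract: C = z z* + sigma * W, W complex Gaussian, Hermitian-ized).
import numpy as np

rng = np.random.default_rng(1)
n, sigma = 50, 0.5
z = np.exp(1j * rng.uniform(0, 2 * np.pi, n))       # ground-truth phases
W = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
W = (W + W.conj().T) / 2                             # Hermitian noise
C = np.outer(z, z.conj()) + sigma * W

x = np.linalg.eigh(C)[1][:, -1]                      # leading-eigenvector init
x = x / np.abs(x)                                    # project to unit modulus
for _ in range(100):
    y = C @ x
    x = y / np.abs(y)                                # GPM step: entrywise phase

corr = np.abs(np.vdot(x, z)) / n                     # alignment, up to global phase
```

A correlation near 1 indicates recovery of the phases up to the unavoidable global phase ambiguity; the abstract's ℓ∞ bounds control the worst single entry rather than this average measure.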
Concentration of the Kirchhoff index for Erdos-Renyi graphs
Given an undirected graph, the resistance distance between two nodes is the
resistance one would measure between these two nodes in an electrical network
if edges were resistors. Summing these distances over all pairs of nodes yields
the so-called Kirchhoff index of the graph, which measures its overall
connectivity. In this work, we consider Erdos-Renyi random graphs. Since the
graphs are random, their Kirchhoff indices are random variables. We give
formulas for the expected value of the Kirchhoff index and show it concentrates
around its expectation. We achieve this by studying the trace of the
pseudoinverse of the Laplacian of Erdos-Renyi graphs. For synchronization (a
class of estimation problems on graphs) our results imply that acquiring
pairwise measurements uniformly at random is a good strategy, even if only a
vanishing proportion of the measurements can be acquired.
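The quantity studied here is easy to compute for a sampled graph: the Kirchhoff index equals n times the trace of the Moore–Penrose pseudoinverse of the Laplacian, i.e. n times the sum of reciprocals of the nonzero Laplacian eigenvalues. A small sketch, with illustrative n and p (chosen so the graph is connected with overwhelming probability):

```python
# Sketch: Kirchhoff index of an Erdos-Renyi graph G(n, p) via the trace
# of the pseudoinverse of its Laplacian, Kf(G) = n * tr(L^+).
import numpy as np

rng = np.random.default_rng(2)
n, p = 60, 0.3
A = (rng.random((n, n)) < p).astype(float)
A = np.triu(A, 1); A = A + A.T                      # symmetric, no self-loops
L = np.diag(A.sum(1)) - A                           # graph Laplacian

eigvals = np.linalg.eigvalsh(L)                     # ascending; eigvals[0] = 0
assert eigvals[1] > 1e-8                            # connected (w.h.p. for p = 0.3)
kirchhoff = n * np.sum(1.0 / eigvals[1:])           # n * tr(L^+)
```

Repeating this over independent draws of the graph would exhibit the concentration the abstract proves: the sampled values cluster tightly around the expected Kirchhoff index.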
Computational Complexity versus Statistical Performance on Sparse Recovery Problems
We show that several classical quantities controlling compressed sensing
performance directly match classical parameters controlling algorithmic
complexity. We first describe linearly convergent restart schemes on
first-order methods solving a broad range of compressed sensing problems, where
sharpness at the optimum controls convergence speed. We show that for sparse
recovery problems, this sharpness can be written as a condition number, given
by the ratio between true signal sparsity and the largest signal size that can
be recovered by the observation matrix. In a similar vein, Renegar's condition
number is a data-driven complexity measure for convex programs, generalizing
classical condition numbers for linear systems. We show that for a broad class
of compressed sensing problems, the worst case value of this algorithmic
complexity measure taken over all signals matches the restricted singular value
of the observation matrix which controls robust recovery performance. Overall,
this means in both cases that, in compressed sensing problems, a single
parameter directly controls both computational complexity and recovery
performance. Numerical experiments illustrate these points using several
classical algorithms. (Comment: Final version, to appear in Information and Inference.)
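The restart schemes described above can be sketched on a concrete sparse recovery instance. Below, a standard accelerated proximal gradient method (FISTA) for the LASSO is restarted on a fixed schedule by resetting its momentum; the problem sizes, regularization weight, and restart period are illustrative choices, not the paper's tuned scheme (which adapts restarts to the sharpness constant).

```python
# Sketch: restarted accelerated first-order method (FISTA with periodic
# momentum resets) on a LASSO sparse recovery instance.
# Sizes, lambda, and the restart schedule are illustrative.
import numpy as np

rng = np.random.default_rng(3)
m, n, s = 40, 100, 5
A = rng.standard_normal((m, n)) / np.sqrt(m)        # Gaussian observation matrix
x_true = np.zeros(n); x_true[:s] = rng.standard_normal(s)
b = A @ x_true                                      # noiseless observations
lam = 0.01                                          # l1 regularization weight
Lc = np.linalg.norm(A, 2) ** 2                      # Lipschitz constant of grad

soft = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t, 0)  # prox of l1

x = np.zeros(n)
for restart in range(10):                           # outer loop: restart
    y, t = x.copy(), 1.0                            # reset momentum state
    for _ in range(50):                             # inner accelerated pass
        x_new = soft(y - (A.T @ (A @ y - b)) / Lc, lam / Lc)
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
        y = x_new + ((t - 1) / t_new) * (x_new - x) # Nesterov extrapolation
        x, t = x_new, t_new

err = np.linalg.norm(x - x_true)                    # recovery error
```

On instances where sharpness holds at the optimum, such restarted schemes converge linearly, which is the computational side of the complexity/performance correspondence the abstract establishes.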
Fast convergence of trust-regions for non-isolated minima via analysis of CG on indefinite matrices
Trust-region methods (TR) can converge quadratically to minima where the
Hessian is positive definite. However, if the minima are not isolated, then the
Hessian there cannot be positive definite. The weaker
Polyak–Łojasiewicz (PŁ) condition is compatible with
non-isolated minima, and it is enough for many algorithms to preserve good
local behavior. Yet, TR with an exact subproblem solver lacks even
basic features such as a capture theorem under PŁ.
In practice, a popular subproblem solver is the truncated
conjugate gradient method (tCG). Empirically, TR-tCG exhibits super-linear
convergence under PŁ. We confirm this theoretically.
The main mathematical obstacle is that, under PŁ, at points arbitrarily
close to minima, the Hessian has vanishingly small, possibly negative
eigenvalues. Thus, tCG is applied to ill-conditioned, indefinite systems. Yet,
the core theory underlying tCG is that of CG, which assumes a positive definite
operator. Accordingly, we develop new tools to analyze the dynamics of CG in
the presence of small eigenvalues of any sign, for the regime of interest to
TR-tCG.
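The truncated conjugate gradient method discussed above runs ordinary CG on the Newton system H s = -g and stops early in exactly the two situations the abstract highlights: when it detects a direction of nonpositive curvature, or when the iterate leaves the trust region. A minimal Steihaug–Toint-style sketch (variable names and the test matrix are illustrative):

```python
# Sketch of truncated CG (Steihaug-Toint style) for the trust-region
# subproblem  min_s  g's + (1/2) s'Hs  s.t.  ||s|| <= radius,
# stopping at the boundary or upon detecting nonpositive curvature.
import numpy as np

def to_boundary(s, p, radius):
    # largest tau >= 0 with ||s + tau * p|| = radius
    a, b, c = p @ p, 2 * (s @ p), s @ s - radius ** 2
    tau = (-b + np.sqrt(b * b - 4 * a * c)) / (2 * a)
    return s + tau * p

def tcg(H, g, radius, maxiter=100, tol=1e-10):
    s = np.zeros_like(g)
    r = g.copy()                          # residual of H s + g at s = 0
    p = -r                                # initial search direction
    for _ in range(maxiter):
        Hp = H @ p
        curv = p @ Hp
        if curv <= 0:                     # nonpositive curvature: hit boundary
            return to_boundary(s, p, radius)
        alpha = (r @ r) / curv
        s_new = s + alpha * p
        if np.linalg.norm(s_new) >= radius:   # step leaves trust region
            return to_boundary(s, p, radius)
        r_new = r + alpha * Hp
        if np.linalg.norm(r_new) < tol:       # system solved inside region
            return s_new
        beta = (r_new @ r_new) / (r @ r)
        s, r, p = s_new, r_new, -r_new + beta * p
    return s

# indefinite, ill-conditioned example of the kind the analysis targets:
# tiny and slightly negative Hessian eigenvalues near a non-isolated minimum
H = np.diag([5.0, 1.0, 1e-4, -1e-3])
g = np.ones(4)
s = tcg(H, g, radius=2.0)
```

The returned step always stays inside the trust region and decreases the quadratic model; the paper's contribution is to show how fast such steps shrink the gradient when the small, possibly negative eigenvalues above are present.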